Getting Started

For many of the basics of RMarkdown, the RStudio introduction is well worth reading. Almost every R visualization output is compatible with RMarkdown.

Table of Contents:

A floating table of contents is preferred for longer analyses rendered as HTML documents. It lets the reader move between sections easily without having to scroll back to the top of the document. The template also includes some slight changes to the markdown output format to make the most of the reader’s screen. Be aware that a floating TOC is neither mobile friendly (nobody should really be viewing on a phone anyway) nor very usable on screens narrower than 1000 pixels.

A toc_depth of two is preferred, which means that any section heading that starts with one or two #’s will appear in the table of contents. Deeper markdown headers still work; they are just omitted from the table of contents.
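As a rough sketch, the floating table of contents described above could be configured in the YAML header of the .Rmd file as follows. The title is a placeholder, and the CountyStat template may set additional options that are not shown here.

```yaml
---
title: "Analysis Title"   # placeholder
output:
  html_document:
    toc: true          # build a table of contents
    toc_float: true    # keep it floating beside the content
    toc_depth: 2       # include only # and ## headers
---
```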

Common Standards

Loading Data

Database Connections

require(DBI)

wh_host <- Sys.getenv('WH_HOST')
wh_db <- Sys.getenv('WH_DB')
wh_user <- Sys.getenv('WH_USER')
wh_pass <- Sys.getenv('WH_PASS')

# Connect to the CountyStat DataWarehouse
wh_con <- dbConnect(odbc::odbc(),
                    driver = "{ODBC Driver 17 for SQL Server}",
                    server = wh_host,
                    database = wh_db,
                    UID = wh_user,
                    pwd = wh_pass,
                    Trusted_Connection = "yes")

Always keep your DataWarehouse credentials in a .env file. An example is included in this repo, but under normal circumstances this file should be listed in .gitignore so that it is never committed.
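Before connecting, it can help to confirm that the credentials were actually loaded into the session. The sketch below uses only base R and assumes the four WH_* variable names used above; the helper name missing_env_vars() is hypothetical and not part of the CountyStat template.

```r
# Hypothetical helper: return the names of any environment variables
# that are unset or empty.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[values == ""]
}

missing <- missing_env_vars(c("WH_HOST", "WH_DB", "WH_USER", "WH_PASS"))

# Warn early instead of failing later with an opaque ODBC error.
if (length(missing) > 0) {
  warning("Missing DataWarehouse settings: ", paste(missing, collapse = ", "))
}
```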

To learn more about DBI, check out its documentation. For the most part you will use either the dbGetQuery() or dbReadTable() function.
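The difference between the two functions can be sketched as follows. To keep the example self-contained, it uses an in-memory SQLite database (through the RSQLite package, an assumption made purely for illustration) in place of the DataWarehouse connection, and the table name cars is made up.

```r
library(DBI)

# Stand-in connection; in practice this would be wh_con from above.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "cars", mtcars)

# dbReadTable() pulls an entire table into a data frame.
all_cars <- dbReadTable(con, "cars")

# dbGetQuery() runs a SQL statement and returns only the result set.
four_cyl <- dbGetQuery(con, "SELECT mpg, cyl, wt FROM cars WHERE cyl = 4")

dbDisconnect(con)
```

For large warehouse tables, dbGetQuery() with a WHERE clause is usually the safer choice, since dbReadTable() transfers every row.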

Reading Flat Files

For flat files, readr handles most CSVs and other basic delimited formats, readxl is best for Excel documents, jsonlite is ideal for JSON documents, and xml2 will do if you have the unfortunate circumstance of dealing with XML.

require(readr)
require(readxl)
require(jsonlite)
require(xml2)
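A minimal sketch of the readr and jsonlite readers is shown below. The file contents and municipality names are invented so that the example runs without any external data.

```r
library(readr)
library(jsonlite)

# Write a tiny CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("municipality,population",
             "Example Borough,1200",
             "Sample Township,8500"),
           csv_path)

# read_csv() guesses column types and returns a tibble.
munis_csv <- read_csv(csv_path, show_col_types = FALSE)

# fromJSON() simplifies an array of objects into a data frame.
munis_json <- fromJSON('[{"municipality": "Example Borough", "population": 1200}]')
```

readxl::read_excel() and xml2::read_xml() are called the same way, with a file path as the first argument.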

Spatial Data

The easiest way to read spatial data for mapping in ggplot2, ggmap, or leaflet is the sf package. When looking for a spatial dataset, the first place to check is always what the County ESRI team has published on their GIS Open Data site. If you can’t find what you are looking for, and it is something you think the County should have access to, email their team at GISHelp@AlleghenyCounty.US.

require(sf)

munis <- read_sf('https://opendata.arcgis.com/datasets/9de0e9c07af04e638dbc9cb9070962c2_0.geojson')

plot(munis)

Census Data

The easiest way to use the Census API in R is with the tidycensus package. First you have to sign up for an API key, but it’s a free and easy process.

require(tidycensus)

census_api_key(Sys.getenv("census"))

v19 <- load_variables(2019, "acs5", cache = TRUE)

alco_muni_pop <- get_acs(geography = "county subdivision",
                         state = 'PA',
                         county = 'Allegheny',
                         year = 2019,
                         variables = 'B01003_001',
                         geometry = TRUE)

Charts and Graphs

The easiest package for quickly creating charts and graphs in R is ggplot2. One of the best resources for ggplot2 visualizations is the Top 50 ggplot2 Visualizations list from Selva Prabhakaran. It includes full code and examples using data you can access directly from R. The same site also has a good introduction to ggplot2 if you find the one from RStudio insufficient.

require(ggplot2)

g <- ggplot(mpg, aes(manufacturer)) + geom_bar(aes(fill=class), width = 0.5) + 
  theme(axis.text.x = element_text(angle=65, vjust=0.6)) + 
  labs(title="Histogram on Categorical Variable", 
       subtitle="Manufacturer across Vehicle Classes",
       caption = "EPA Fuel Economy 1999-2008") 

g

If you are using this document as your template, that means you can create interactive visuals. The easiest way to do this with ggplot is using the ggplotly function from the plotly library.

require(plotly)

ggplotly(g) %>%
  layout(annotations = list(x = 1,
                            y = -0.28, 
                            text = 'EPA Fuel Economy 1999-2008', 
                            showarrow = F,
                            xref='paper',
                            yref='paper', 
                            xanchor='right', 
                            yanchor='auto', 
                            xshift=0, 
                            yshift=0,
                            font=list(size=12, color = "darkgray")),
         title = list(text = paste0('Histogram on Categorical Variable',
                                    '<br>',
                                    '<sup>',
                                    'Manufacturer across Vehicle Classes',
                                    '</sup>')))

Note that not all of the labels translate: subtitles and captions are lost and have to be added through other methods, such as the annotations and title arguments to layout() used above.

Colors

A great resource for color palettes is ColorBrewer, which is easy to use in ggplot2.

g + scale_fill_brewer(palette = "Spectral")
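When the same colors are needed outside of a ggplot2 scale (for example in a table header or a manual scale), the hex codes can be pulled out directly. This sketch assumes the RColorBrewer package, which is not mentioned in the text above but provides the same palettes.

```r
library(RColorBrewer)

# brewer.pal() returns the hex codes for n colors from a named palette.
spectral_5 <- brewer.pal(n = 5, name = "Spectral")
spectral_5

# brewer.pal.info lists every palette with its maximum size and category.
head(brewer.pal.info)
```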

Tables

The best functions for tables are datatable() from the DT package and kable() from knitr. The better option will depend on the number of rows and the situation you find yourself in. Both functions allow for a wide range of customization and formatting. CountyStat has a standard table format to start with; however, analysts should feel free to check out the documentation for both packages, especially when looking to format rows to highlight certain values, increase readability, or otherwise improve the table.

DT

DT has extensive documentation showing its features. Below is the standard format for a CountyStat DT table. One of the biggest benefits to this package is the horizontal scroll option and download button feature.

require(DT)

datatable(iris, 
          caption = "Fisher's or Anderson's", 
          rownames = F,
          escape = F,
          extensions = 'Buttons', 
          options = list(
            dom = 'Bfrtip',
            buttons = c('copy', 'csv', 'excel', 'pdf', 'print'),
            # scrollX = TRUE,  # uncomment to enable horizontal scrolling for wide tables
            initComplete = JS(
              "function(settings, json) {",
              "$(this.api().table().header()).css({'background-color': '#008080', 'color': '#fff'});",
              "}")
            )
          )
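Highlighting certain values, as mentioned earlier, can be sketched with formatStyle() and styleInterval() from DT. The cutoff of 6 and the colors are arbitrary choices for illustration, not a CountyStat standard.

```r
library(DT)

# Shade Sepal.Length cells orange when the value is above 6;
# values at or below the cutoff keep a white background.
highlighted <- formatStyle(
  datatable(iris, rownames = FALSE),
  columns = "Sepal.Length",
  backgroundColor = styleInterval(cuts = 6, values = c("white", "#fdae61"))
)

highlighted
```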

kable w/ kableExtra

Use the kableExtra package to format your kable() tables. As with the DT example above, a standard CountyStat format is shown below, but analysts should feel free to use the wide variety of features that kableExtra provides to improve readability and convey a message.

require(dplyr) # provides sample_n()
require(kableExtra)

iris %>%
  sample_n(15) %>% # Select 15 random rows
  kbl(caption = "Fisher's or Anderson's") %>%
  kable_styling(font_size = 12) %>%
  row_spec(0, bold = T, background = "#008080", color = "white")
Fisher’s or Anderson’s
Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
         4.9          3.1           1.5          0.1  setosa
         6.5          3.0           5.8          2.2  virginica
         7.2          3.2           6.0          1.8  virginica
         6.3          3.3           6.0          2.5  virginica
         5.1          3.5           1.4          0.2  setosa
         6.5          2.8           4.6          1.5  versicolor
         5.5          2.4           3.7          1.0  versicolor
         5.0          3.6           1.4          0.2  setosa
         6.1          2.8           4.0          1.3  versicolor
         6.7          3.1           4.7          1.5  versicolor
         6.6          3.0           4.4          1.4  versicolor
         5.3          3.7           1.5          0.2  setosa
         4.9          2.4           3.3          1.0  versicolor
         6.3          2.9           5.6          1.8  virginica
         5.1          3.7           1.5          0.4  setosa
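As one example of the additional kableExtra features, row_spec() can also shade selected body rows. The sketch below uses a fixed slice of iris so that the result is reproducible; the rule for which rows to highlight and the highlight color are arbitrary.

```r
library(kableExtra)

# Fixed slice of iris: three setosa rows followed by three versicolor rows.
tbl_data <- iris[c(1:3, 51:53), ]

# Row positions to emphasize.
setosa_rows <- which(tbl_data$Species == "setosa")

highlighted_kbl <- kbl(tbl_data, format = "html", row.names = FALSE,
                       caption = "Fisher's or Anderson's") %>%
  kable_styling(font_size = 12) %>%
  row_spec(0, bold = TRUE, background = "#008080", color = "white") %>%
  row_spec(setosa_rows, background = "#e0f2f1")
```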

Mapping

leaflet

The most versatile mapping package for HTML RMarkdown output is leaflet. It can map polygons, points, lines, and more. The CountyStat team typically uses the Stamen.Toner base map, as it conflicts the least with color schemes. As with ggplot2, the ColorBrewer palettes are integrated and highly recommended.

require(dplyr) # provides mutate() and the %>% pipe
require(leaflet)

alco_muni_pop <- alco_muni_pop %>%
  mutate(NAME = gsub(", Allegheny County, Pennsylvania", "", NAME),
         NAME = tools::toTitleCase(NAME))

pal <- colorBin("YlOrRd",
                domain = alco_muni_pop$estimate,
                bins = c(0, 1000, 5000, 10000, 25000, 50000, 100000, max(alco_muni_pop$estimate)))

leaflet(alco_muni_pop) %>% 
  addProviderTiles(providers$Stamen.Toner) %>%
  addPolygons(color = "#444444", weight = 1, smoothFactor = 0.5,
    opacity = 1.0, fillOpacity = 0.5,
    popup = ~paste("<b>Name:</b>", NAME,
                   "<br><b>Est. Population:</b>", prettyNum(estimate, big.mark = ",")),
    fillColor = ~pal(estimate),
    highlightOptions = highlightOptions(color = "white", weight = 2,
      bringToFront = TRUE)) %>%
  addLegend("bottomright", pal = pal, values = ~estimate,
    title = "Est. Pop (2019 ACS)",
    opacity = 1
  )
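To see what the palette function built by colorBin() does on its own, it can be called directly on a few numbers. The population values and the upper break of 300000 below are made up for illustration and do not come from the ACS data.

```r
library(leaflet)

# Made-up population values and break points.
example_pop <- c(500, 7500, 120000)
example_pal <- colorBin("YlOrRd",
                        domain = c(0, 300000),
                        bins = c(0, 1000, 5000, 10000, 25000, 50000, 100000, 300000))

# Each value is mapped to the hex color of the bin it falls in.
example_pal(example_pop)
```

Values that fall in the same bin share a fill color, which is why the choice of breaks matters more than the palette for a skewed variable like municipal population.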

ggplot2 map

If you don’t need interactivity or a base map, you can just use ggplot2.

require(ggthemes)

m <- ggplot(data = alco_muni_pop, aes(fill = estimate)) +
  geom_sf() +
  theme_map() +
  labs(title = "Population by Municipality",
       subtitle = "Allegheny County",
       caption = "ACS 2019 Estimate") +
  scale_fill_gradient(low = '#efedf5', high = '#3f007d')

m

ggplotly map

To add interactivity, you can call ggplotly() on a typical geom_sf plot, but in most cases leaflet is a better way to achieve this effect.

ggplotly(m)

ggmap

ggmap works well if you do not need interactive elements but want to add a base map for context. As with leaflet, a Stamen Toner base map is suggested; the example below uses the toner-lite variant.

require(ggmap)

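# Convert the sf bounding box into the named vector (left, bottom, right, top) that ggmap expects
bbox <- st_bbox(alco_muni_pop)
bbox_trans <- c(left = bbox[["xmin"]], bottom = bbox[["ymin"]], right = bbox[["xmax"]], top = bbox[["ymax"]])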

get_stamenmap(bbox_trans, maptype = "toner-lite") %>% 
  ggmap() +
  theme_map() + 
  labs(title = "Population by Municipality",
       subtitle = "Allegheny County",
       caption = "ACS 2019 Estimate") +
  geom_sf(data = alco_muni_pop, aes(fill = estimate), inherit.aes = FALSE, alpha = .75) +
  scale_fill_gradient(low = '#efedf5', high = '#3f007d')